3
4
+--------+--------------------+---------------------+---------------+-------------+-------------------+------------------+----------+------------------+-------------------+------------------+------------+-----------+-----+-------+----------+------------+---------------------+------------+
|VendorID|tpep_pickup_datetime|tpep_dropoff_datetime|passenger_count|trip_distance| pickup_longitude| pickup_latitude|RateCodeID|store_and_fwd_flag| dropoff_longitude| dropoff_latitude|payment_type|fare_amount|extra|mta_tax|tip_amount|tolls_amount|improvement_surcharge|total_amount|
+--------+--------------------+---------------------+---------------+-------------+-------------------+------------------+----------+------------------+-------------------+------------------+------------+-----------+-----+-------+----------+------------+---------------------+------------+
| 2| 2015-01-15 19:05:39| 2015-01-15 19:23:42| 1| 1.59| -73.993896484375|40.750110626220703| 1| N|-73.974784851074219|40.750617980957031| 1| 12| 1| 0.5| 3.25| 0| 0.3| 17.05|
| 1| 2015-01-10 20:33:38| 2015-01-10 20:53:28| 1| 3.30| -74.00164794921875| 40.7242431640625| 1| N|-73.994415283203125|40.759109497070313| 1| 14.5| 0.5| 0.5| 2| 0| 0.3| 17.8|
| 1| 2015-01-10 20:33:38| 2015-01-10 20:43:41| 1| 1.80|-73.963340759277344|40.802787780761719| 1| N|-73.951820373535156|40.824413299560547| 2| 9.5| 0.5| 0.5| 0| 0| 0.3| 10.8|
| 1| 2015-01-10 20:33:39| 2015-01-10 20:35:31| 1| .50|-74.009086608886719|40.713817596435547| 1| N|-74.004325866699219|40.719985961914063| 2| 3.5| 0.5| 0.5| 0| 0| 0.3| 4.8|
| 1| 2015-01-10 20:33:39| 2015-01-10 20:52:58| 1| 3.00|-73.971176147460938|40.762428283691406| 1| N|-74.004180908203125|40.742652893066406| 2| 15| 0.5| 0.5| 0| 0| 0.3| 16.3|
+--------+--------------------+---------------------+---------------+-------------+-------------------+------------------+----------+------------------+-------------------+------------------+------------+-----------+-----+-------+----------+------------+---------------------+------------+
only showing top 5 rows
6
12748986
7
['VendorID',
'tpep_pickup_datetime',
'tpep_dropoff_datetime',
'passenger_count',
'trip_distance',
'pickup_longitude',
'pickup_latitude',
'RateCodeID',
'store_and_fwd_flag',
'dropoff_longitude',
'dropoff_latitude',
'payment_type',
'fare_amount',
'extra',
'mta_tax',
'tip_amount',
'tolls_amount',
'improvement_surcharge',
'total_amount']
8
DataFrame[summary: string, VendorID: string, tpep_pickup_datetime: string, tpep_dropoff_datetime: string, passenger_count: string, trip_distance: string, pickup_longitude: string, pickup_latitude: string, RateCodeID: string, store_and_fwd_flag: string, dropoff_longitude: string, dropoff_latitude: string, payment_type: string, fare_amount: string, extra: string, mta_tax: string, tip_amount: string, tolls_amount: string, improvement_surcharge: string, total_amount: string]
9
root
|-- VendorID: string (nullable = true)
|-- tpep_pickup_datetime: string (nullable = true)
|-- tpep_dropoff_datetime: string (nullable = true)
|-- passenger_count: string (nullable = true)
|-- trip_distance: string (nullable = true)
|-- pickup_longitude: string (nullable = true)
|-- pickup_latitude: string (nullable = true)
|-- RateCodeID: string (nullable = true)
|-- store_and_fwd_flag: string (nullable = true)
|-- dropoff_longitude: string (nullable = true)
|-- dropoff_latitude: string (nullable = true)
|-- payment_type: string (nullable = true)
|-- fare_amount: string (nullable = true)
|-- extra: string (nullable = true)
|-- mta_tax: string (nullable = true)
|-- tip_amount: string (nullable = true)
|-- tolls_amount: string (nullable = true)
|-- improvement_surcharge: string (nullable = true)
|-- total_amount: string (nullable = true)
10
+-------+------------------+------------------+------------------+
|summary| trip_distance| passenger_count| fare_amount|
+-------+------------------+------------------+------------------+
| count| 12748986| 12748986| 12748986|
| mean|13.459129611562718|1.6814908260154964|11.905659425776989|
| stddev| 9844.094218468374|1.3379235172874737|10.302537135952232|
| min| .00| 0| -0.01|
| max| 99.90| 9| 999.99|
+-------+------------------+------------------+------------------+
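Note on the min and max rows above: a maximum trip_distance of 99.90 sits oddly next to a standard deviation near 9844. Every column in the schema is a string, so a likely explanation is that min and max are compared as text while mean and stddev are computed after a numeric cast. The following is a minimal plain-Python sketch of that effect; the sample values are hypothetical and not taken from the dataset.

```python
# Hypothetical sample values stored as strings, mirroring the all-string schema.
distances = [".00", "1.59", "3.30", "99.90", "15420.00"]

# String comparison orders values character by character,
# so "99.90" sorts after "15420.00" and ".00" sorts before any digit.
lexicographic_min = min(distances)
lexicographic_max = max(distances)

# Casting to float first gives the numeric extremes.
numeric_min = min(float(d) for d in distances)
numeric_max = max(float(d) for d in distances)

print(lexicographic_min, lexicographic_max)  # .00 99.90
print(numeric_min, numeric_max)              # 0.0 15420.0
```

If this is what happened here, casting the numeric columns before summarising would give meaningful extremes.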
12
+-------+------------------+--------------------+---------------------+------------------+------------------+-------------------+------------------+------------------+------------------+-------------------+-------------------+------------------+------------------+-------------------+------------------+------------------+-------------------+---------------------+------------------+
|summary| VendorID|tpep_pickup_datetime|tpep_dropoff_datetime| passenger_count| trip_distance| pickup_longitude| pickup_latitude| RateCodeID|store_and_fwd_flag| dropoff_longitude| dropoff_latitude| payment_type| fare_amount| extra| mta_tax| tip_amount| tolls_amount|improvement_surcharge| total_amount|
+-------+------------------+--------------------+---------------------+------------------+------------------+-------------------+------------------+------------------+------------------+-------------------+-------------------+------------------+------------------+-------------------+------------------+------------------+-------------------+---------------------+------------------+
| count| 12748986| 12748986| 12748986| 12748986| 12748986| 12748986| 12748986| 12748986| 12748986| 12748986| 12748986| 12748986| 12748986| 12748986| 12748986| 12748986| 12748986| 12748983| 12748986|
| mean|1.5214373127400094| NULL| NULL|1.6814908260154964|13.459129611562718| -72.56183777902534| 39.97282304763482|1.0369007386156044| NULL| -72.60903923063492| 39.9996144802455|1.3867115392549652|11.905659425776989|0.30827895724412907|0.4977986092384132|1.8538136460419994|0.24349839430352666| 0.28314307893811447|15.108294537401271|
| stddev|0.4995402498256225| NULL| NULL|1.3379235172874737| 9844.094218468374| 10.125103592972911| 5.5786905190884|0.6732239779497589| NULL| 9.96603703803103| 5.48774188661968|0.4988610635053929|10.302537135952232| 0.5916643112912818|0.0353422867098315|1106.4323141838747| 1.5271714003797854| 0.06908632935830779| 1106.503246710499|
| min| 1| 2015-01-01 00:00:00| 2015-01-01 00:00:00| 0| .00|-1.3151819705963135| 0| 1| N|-0.1166670024394989|-9.0291566848754883| 1| -0.01| -0.09| -0.5| -0.01| -10.66| 0| -0.31|
| max| 2| 2015-01-31 23:59:59| 2016-02-02 16:30:52| 9| 99.90| 78.662651062011719|9.5878467559814453| 99| Y| 85.274024963378906| 9.9809532165527344| 5| 999.99| 999.99| 0.5| 99.96| 999.99| 0.3| 99.99|
+-------+------------------+--------------------+---------------------+------------------+------------------+-------------------+------------------+------------------+------------------+-------------------+-------------------+------------------+------------------+-------------------+------------------+------------------+-------------------+---------------------+------------------+
13
+-------+------------------+--------------------+---------------------+------------------+------------------+-------------------+-------------------+------------------+------------------+-------------------+------------------+------------------+------------------+-------------------+--------------------+------------------+------------------+---------------------+------------------+
|summary| VendorID|tpep_pickup_datetime|tpep_dropoff_datetime| passenger_count| trip_distance| pickup_longitude| pickup_latitude| RateCodeID|store_and_fwd_flag| dropoff_longitude| dropoff_latitude| payment_type| fare_amount| extra| mta_tax| tip_amount| tolls_amount|improvement_surcharge| total_amount|
+-------+------------------+--------------------+---------------------+------------------+------------------+-------------------+-------------------+------------------+------------------+-------------------+------------------+------------------+------------------+-------------------+--------------------+------------------+------------------+---------------------+------------------+
| count| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464| 9474464|
| mean|1.5275759135292508| NULL| NULL|1.6921189420319713|17.829444945907543| -73.97325995357461| 40.750038711870616| 1.034897910847516| NULL| -73.97297563495772| 40.75096497538369| 1.35772577741601|13.910922983084133| 0.3140457919308153| 0.4985670429482871|2.2235511032600423|0.3158089069733504| 0.2828745562794641|17.562448306351715|
| stddev|0.4992390162031469| NULL| NULL|1.3442813854320428|11419.212112398436|0.07848886714950458|0.12801451565479846|0.5417347281929257| NULL| 0.3861348522326615|0.1887650711938517|0.4870103396253098|10.313916600093934|0.45631840124394124|0.026728734263891794|1283.4682079132479|1.5335839535427085| 0.06960138507560257| 1283.531044935015|
| min| 1| 2015-01-01 00:00:00| 2015-01-01 00:00:00| 0| 1.00|-1.3151819705963135| 18.625944137573242| 1| N|-1.2284070253372192|18.625944137573242| 1| 1| -0.4| 0| 0| 0| 0| -69.7|
| max| 2| 2015-01-31 23:59:59| 2015-02-15 21:09:34| 9| 99.90| 78.662651062011719| 9.5878467559814453| 99| Y| 78.662651062011719|9.9809532165527344| 4| 99.99| 7| 0.5| 98.65| 95.5| 0.3| 99.99|
+-------+------------------+--------------------+---------------------+------------------+------------------+-------------------+-------------------+------------------+------------------+-------------------+------------------+------------------+------------------+-------------------+--------------------+------------------+------------------+---------------------+------------------+
14
+--------+--------------------+---------------------+---------------+-------------+-------------------+------------------+----------+------------------+-------------------+------------------+------------+-----------+-----+-------+----------+------------+---------------------+------------+-------------------+-------------------+
|VendorID|tpep_pickup_datetime|tpep_dropoff_datetime|passenger_count|trip_distance| pickup_longitude| pickup_latitude|RateCodeID|store_and_fwd_flag| dropoff_longitude| dropoff_latitude|payment_type|fare_amount|extra|mta_tax|tip_amount|tolls_amount|improvement_surcharge|total_amount| pickup_datetime| dropoff_datetime|
+--------+--------------------+---------------------+---------------+-------------+-------------------+------------------+----------+------------------+-------------------+------------------+------------+-----------+-----+-------+----------+------------+---------------------+------------+-------------------+-------------------+
| 2| 2015-01-15 19:05:39| 2015-01-15 19:23:42| 1| 1.59| -73.993896484375|40.750110626220703| 1| N|-73.974784851074219|40.750617980957031| 1| 12| 1| 0.5| 3.25| 0| 0.3| 17.05|2015-01-15 19:05:39|2015-01-15 19:23:42|
| 1| 2015-01-10 20:33:38| 2015-01-10 20:53:28| 1| 3.30| -74.00164794921875| 40.7242431640625| 1| N|-73.994415283203125|40.759109497070313| 1| 14.5| 0.5| 0.5| 2| 0| 0.3| 17.8|2015-01-10 20:33:38|2015-01-10 20:53:28|
| 1| 2015-01-10 20:33:38| 2015-01-10 20:43:41| 1| 1.80|-73.963340759277344|40.802787780761719| 1| N|-73.951820373535156|40.824413299560547| 2| 9.5| 0.5| 0.5| 0| 0| 0.3| 10.8|2015-01-10 20:33:38|2015-01-10 20:43:41|
| 1| 2015-01-10 20:33:39| 2015-01-10 20:52:58| 1| 3.00|-73.971176147460938|40.762428283691406| 1| N|-74.004180908203125|40.742652893066406| 2| 15| 0.5| 0.5| 0| 0| 0.3| 16.3|2015-01-10 20:33:39|2015-01-10 20:52:58|
| 1| 2015-01-10 20:33:39| 2015-01-10 20:53:52| 1| 9.00|-73.874374389648438| 40.7740478515625| 1| N|-73.986976623535156|40.758193969726563| 1| 27| 0.5| 0.5| 6.7| 5.33| 0.3| 40.33|2015-01-10 20:33:39|2015-01-10 20:53:52|
+--------+--------------------+---------------------+---------------+-------------+-------------------+------------------+----------+------------------+-------------------+------------------+------------+-----------+-----+-------+----------+------------+---------------------+------------+-------------------+-------------------+
only showing top 5 rows
15
+-------------------+-------------------+------------------+------------------+
| pickup_datetime| dropoff_datetime| trip_duration_min| trip_speed_mph|
+-------------------+-------------------+------------------+------------------+
|2015-01-15 19:05:39|2015-01-15 19:23:42| 18.05| 5.285318559556787|
|2015-01-10 20:33:38|2015-01-10 20:53:28|19.833333333333332| 9.983193277310924|
|2015-01-10 20:33:38|2015-01-10 20:43:41| 10.05|10.746268656716417|
|2015-01-10 20:33:39|2015-01-10 20:52:58|19.316666666666666| 9.318377911993098|
|2015-01-10 20:33:39|2015-01-10 20:53:52|20.216666666666665|26.710634789777412|
+-------------------+-------------------+------------------+------------------+
only showing top 5 rows
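The values above appear consistent with trip_duration_min being the pickup-to-dropoff difference in minutes, and trip_speed_mph being trip_distance divided by the duration in hours. The sketch below checks this against the first row (trip_distance 1.59 from the earlier table). The helper name `trip_metrics` is hypothetical, and the zero-duration guard is an assumption rather than something the output confirms.

```python
from datetime import datetime

def trip_metrics(pickup, dropoff, distance_miles):
    """Return (duration in minutes, speed in mph) for one trip."""
    fmt = "%Y-%m-%d %H:%M:%S"
    elapsed = datetime.strptime(dropoff, fmt) - datetime.strptime(pickup, fmt)
    duration_min = elapsed.total_seconds() / 60
    # Assumed guard: avoid dividing by zero for zero-length trips.
    speed_mph = distance_miles / (duration_min / 60) if duration_min > 0 else None
    return duration_min, speed_mph

duration_min, speed_mph = trip_metrics(
    "2015-01-15 19:05:39", "2015-01-15 19:23:42", 1.59
)
print(round(duration_min, 2), round(speed_mph, 4))  # 18.05 5.2853
```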
17
+---------------+------------------+------------------+
|passenger_count| avg_fare| avg_trip_distance|
+---------------+------------------+------------------+
| 3|14.042959587975478| 3.502365063959862|
| 0|12.731226210551675| 2.901132257287401|
| 5|14.038445848844917| 3.540712301877328|
| 6| 13.88652208851831| 3.472525537512198|
| 1|13.790829828712587| 19.80164590310544|
| 4| 14.0730966302188| 3.509825008583094|
| 2|14.390632541239809|23.563276703570445|
| 9| 69.7| 15.962|
| 7| 15.6| 4.28|
| 8| 33.5| 7.263333333333333|
+---------------+------------------+------------------+
18
+-----------+------+
|pickup_hour| count|
+-----------+------+
| 19|592172|
| 18|584945|
| 20|557565|
| 21|554629|
| 22|544294|
| 17|489310|
| 23|473549|
| 14|470190|
| 15|465655|
| 13|450447|
| 12|447838|
| 11|422559|
| 16|417757|
| 9|407165|
| 10|402276|
| 8|400821|
| 0|375346|
| 7|341971|
| 1|284178|
| 2|214544|
+-----------+------+
only showing top 20 rows
19
+-------------------+------------------+--------+
| pickup_longitude| pickup_latitude|avg_fare|
+-------------------+------------------+--------+
|-73.961532592773438|40.770637512207031| 4008.0|
|-73.950325012207031|40.752861022949219| 800.0|
|-73.942741394042969|40.790802001953125| 780.0|
|-73.950279235839844|40.777347564697266| 760.01|
|-73.826431274414063|40.833961486816406| 600.0|
|-73.925872802734375|40.743618011474609| 525.0|
|-73.807296752929688|40.656135559082031| 489.5|
|-73.993026733398437|40.757881164550781| 467.54|
| -73.9478759765625|40.583606719970703| 450.0|
|-73.977920532226563| 40.7623291015625| 448.0|
|-73.974945068359375|40.760028839111328| 440.0|
|-74.000579833984375|40.722129821777344| 435.0|
|-73.781707763671875|40.644550323486328| 434.5|
|-73.789047241210937|40.647251129150391| 420.0|
| -73.98858642578125|40.768974304199219| 414.44|
|-73.873367309570313|40.774147033691406| 405.0|
|-73.973373413085937|40.746353149414063| 400.0|
|-73.789115905761719| 40.6422119140625| 375.5|
|-73.788642883300781|40.641929626464844| 370.0|
|-73.978889465332031|40.749427795410156| 370.0|
+-------------------+------------------+--------+
only showing top 20 rows
27
+-------------------+-------------------+------------------+
| pickup_datetime| dropoff_datetime| trip_duration_min|
+-------------------+-------------------+------------------+
|2015-01-15 19:05:39|2015-01-15 19:23:42| 18.05|
|2015-01-10 20:33:38|2015-01-10 20:53:28|19.833333333333332|
|2015-01-10 20:33:38|2015-01-10 20:43:41| 10.05|
|2015-01-10 20:33:39|2015-01-10 20:52:58|19.316666666666666|
|2015-01-10 20:33:39|2015-01-10 20:53:52|20.216666666666665|
+-------------------+-------------------+------------------+
only showing top 5 rows
29
+-------------------+-----------+----------+
| pickup_datetime|pickup_hour|pickup_day|
+-------------------+-----------+----------+
|2015-01-15 19:05:39| 19| 15|
|2015-01-10 20:33:38| 20| 10|
|2015-01-10 20:33:38| 20| 10|
|2015-01-10 20:33:39| 20| 10|
|2015-01-10 20:33:39| 20| 10|
+-------------------+-----------+----------+
only showing top 5 rows
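The pickup_hour and pickup_day columns above match the hour and day-of-month fields of pickup_datetime. A small plain-Python sketch of that extraction for the first row, assuming the timestamp format shown in the table:

```python
from datetime import datetime

# First pickup_datetime value from the table above.
pickup = datetime.strptime("2015-01-15 19:05:39", "%Y-%m-%d %H:%M:%S")

pickup_hour = pickup.hour  # hour of day, 0-23
pickup_day = pickup.day    # day of month, 1-31

print(pickup_hour, pickup_day)  # 19 15
```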
31
+--------------------+
|max(pickup_datetime)|
+--------------------+
| 2015-01-31 23:59:59|
+--------------------+
33
+-----------+------------------+
|pickup_hour| avg_trip_duration|
+-----------+------------------+
| 0|15.094587482838069|
| 1|14.884602197683598|
| 2|15.016721822407833|
| 3|15.335224492536504|
| 4|15.069514610764612|
| 5|15.088893829183478|
| 6|14.159144579124913|
| 7|15.208690064362195|
| 8| 16.47289180623437|
| 9|16.393476600395424|
| 10|16.045915407995178|
| 11| 15.94066714155104|
| 12|16.122697679369175|
| 13| 16.43576062592638|
| 14| 17.52447918217457|
| 15| 18.01168547529826|
| 16| 17.01756547466589|
| 17|16.594882078845707|
| 18|15.898487607096948|
| 19|14.822664755059892|
+-----------+------------------+
only showing top 20 rows
39
Correlation between trip duration and trip distance: 1.5306018579757404e-05
Correlation between trip duration and fare amount: 0.011440083866136857
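Both coefficients above are close to zero, which is surprising for quantities that normally move together. The very large standard deviations in the summary tables suggest extreme values may be involved. The sketch below computes a Pearson coefficient by hand and shows how a single implausible value can dominate it. The function name `pearson` and all sample data are hypothetical, and this is not necessarily how the figures above were produced.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical durations (minutes) and distances (miles), perfectly linear.
durations = [5.0, 10.0, 15.0, 20.0, 25.0]
distances = [1.0, 2.0, 3.0, 4.0, 5.0]
clean_r = pearson(durations, distances)

# Add one implausible record: a 5-minute trip logged as 10000 miles.
noisy_r = pearson(durations + [5.0], distances + [10000.0])

print(clean_r, noisy_r)  # clean_r is about 1.0; noisy_r is far from it
```

Filtering out implausible durations, distances, and fares before computing the correlation would be one way to test whether outliers explain the near-zero values.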
42